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ABSTRACT 

We investigate rotational spin noise (referred to as timing noise) in non-accreting pulsars: millisecond 
pulsars, canonical pulsars, and magnetars. Particular attention is placed on quantifying the strength 
and non-stationarity of timing noise in millisecond pulsars because the long-term stability of these 
objects is required to detect nanohertz gravitational radiation. We show that a single scaling law is 
sufficient to characterize timing noise in millisecond and canonical pulsars while the same scaling law 
underestimates the levels of timing noise in magnetars. The scaling law, along with a detailed study of 
the millisecond pulsar B1937+21, leads us to conclude that timing noise is latent in most millisecond 
pulsars and will be measurable in many objects when better arrival time estimates are obtained over 
long data spans. The sensitivity of a pulsar timing array to gravitational radiation is strongly affected 
by any timing noise. We conclude that detection of proposed gravitational wave backgrounds will 
require the analysis of more objects than previously suggested over data spans that depend on the 
spectra of both the gravitational wave background and of the timing noise. It is imperative to find 
additional millisecond pulsars in current and future surveys in order to reduce the effects of timing 
noise. 

Subject headings: gravitational waves — pulsars: general — pulsars: specific (PSR B1937+21) - 
stars: neutron 



1. INTRODUCTION 

In most pulsars the residual times of arrival (TOAs) 
show structure that differs greatly from what is expected 
from measurement error alone and is typically consistent 
with having a red power spectrum. This structure is 
generically referred to as spin noise or timing noise (TN) . 

Rotational irregularities of the neutron star appear to 
be the dominant source of TN in most pulsars. Tim- 
ing noise is thought to arise from either changes in cou- 
pling between_the neutron star crust and its superfluid 
core (lJoneslll990D or magneto - spheri c torque fluctuations 
dChend Il987t Kramer et al.l 120061: iCordes fc Shannon! 
120081: iLvne et al.ll2010D . Thus the study of TN provides 
valuable insight into the structure of the neutron star 
and its magnetosphere. 

The observed strength of timing noise varies by more 
than eight orders of magnitude over the known non- 
accreting pulsars, which we subdivide into three classes: 
the magnetars, with spin frequencies v < 1/6 s _1 and rel- 
atively high magnetic fields; the rapidly-spinning and rel- 
atively weakly magnetized millisecond pulsars (MSPs), 
with spin frequencies v > 50 s ; and the canonical pul- 
sars (CPs) with both spin frequencies and magnetic field 
strengths between the two other classes. Some magnetars 
show root mean square (rms) TOA variations of many 
tens of seconds on time scales of years, whereas the most 
stable MSPs have not shown evidence of TN at the 200 ns 
level over decade-long time scales. 

Millisecond pulsars, which have stability compara- 
ble to the best terrestrial clocks, continue to be in- 
tensely studied. Their low levels of TN enable other 
TOA perturbations to be quantified, such as the rel- 
ativistic effects in pulsars with massive (white dwarf) 
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companions (IVerbiest et al.ll2008[). recoil from planetarv- 
mass companions (Konacki & Wolszczan 2003), and the 



appear to posses s intrinsically superior timing stability 



presently-und etected stochastic background of gravita- 
tional waves (iDetweilerl 119791: iHellings fc Downsl 119831: 
IJenet et al.l l2006| ). Interest in the detection of grav- 
itational waves with pulsars has intensified in recent 
years due to the improvement in MSP timing precision. 
This improvement can be attributed to technological ad- 
vancements hi_teJfiscoperecei vers and signal processing 
equipment (iDemore st 2007), improved analysis methods 
(jvan Strate n 2006) , and the discovery of new pulsars that 

posses s 
(jOrd et al.H2006h . 

Long-term timing stability of millisecond pulsars is 
necessary to detect the small correlated perturbations in 
the TOAs associated with passage of gravitational waves 
through the solar system. It has been suggested that if 
sub— 100 ns stability over 5—10 years can be achieved 
for a number of millisecond pulsars (currently estimated 
at A^pta = 20 — 40) in a pulsar timing array (PTA), a 
stochastic background of gravitational wav es at a cos- 
mologically significant level can be detected ()Jenet et al.l 
2005). Only two MSPs have shown any measurable TN, 
making characterizing as well as forecasting TN in MSPs 
difficult. However, the strength and properties of TN will 
certainly affect the detection of gravitational waves, even 
if TN is latent in most objects at present. 

In this paper we analyze TN throughout the pulsar 
population and assess the strength of TN in MSPs. In 
$2] we summarize the phenomenology of TN and show 
that random walk models and related non stationary pro- 
cesses can be used to model most observed TN. In <J3]wc 
suggest two tools for diagnosing TN: one that is appro- 
priate for assessing the long-term stability of MSPs and 
another that can be used to classify and compare TN 
throughout the pulsar population. In we derive seal- 
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ing relationships for TN in canonical pulsars, millisecond 
pulsars, and magnetars. We find that millisecond pulsars 
have TN that is consistent with that observed in canon- 
ical pulsars. We further link MSPs to CPs by showing 
that the behavior of the MSP B1937+21 is similar to 
that found in the canonical pulsar population. In con- 
trast, magnetars are found to possess TN that exceeds 
the amount expected from extrapolation from the other 
populations. In $5] we conclude that TN is present at 
levels that affect the observation strategies employed in 
pulsar timing arrays and suggest detecting gravitational 
radiation requires timing observations of more pulsars 
than previously estimated. In that section we also dis- 
cuss techniques for mitigating TN and improving the sen- 
sitivity of a PTA to gravitational waves. 

2. TIMING NOISE: PHENOMENOLOGY 

Timing noise is manifested as structure in the residu- 
als of a fit to pulsar TO As. A single TOA is determined 
by comparing a profile formed from averaging a large 
number of pulses with a template profile. The averaging 
is conducted to both increase the signal to noise ratio 
and decrease the effects of jitter associated with intrin- 
sic pulse-to-pulse phase and amplitude variations. The 
TOAs are then compared to a model that accounts for 
the propagation of the pulse from the pulsar to the earth 
and refers the arrival times to the solar system barycenter 
(jEdwards et al.ll2006t ). The fit includes terms accounting 
for periodic variations associated with the motion of the 
earth about the solar system barycenter and the reflex 
motion of the pulsar due to a companion, if the pulsar is 
in a binary system. The fit also includes secular terms 
that account for the unknown spin-down of the pulsar, 
and secular, but frequency-dependent terms that correct 
for the propagation of the radio pulses through plasma in 
the interstellar medium. It is essential to fit for the pul- 
sar spin frequency v and frequency derivative v because 
these quantities are intrinsic to the pulsar and cannot be 
predicted using any other technique. With the exception 
of a few young pulsars (such as the Crab and Vela pul- 
sars) the values of the higher order frequency derivatives 
associated with pulsar braking are not measurable on the 
year to decade observing spans over which pulsars have 
presently been observed. 

The residuals &{t) of the fit are used to assess the 
validity of the timing model and identify the presence 
of unmodeled periodic and secular trends. The rms of 
the residuals over an observing span of length T, after a 
second-order polynomial fit is given by 

1 Nt 

i 

for an observation comprising Nt samples at times £j, 
with i = l,N t . 

The variance erjg 2 can be subdivided into a white com- 
ponent erjy and a red component er^. N 2 that in canonical 
pulsars is usually dominated by TN: 

4 i2 (t) = 4n,2(t) + ^. (2) 

In this discussion red is used to label processes which 
have ensemble average power spectra that have greater 
power at lower fluctuation frequencies (red spectra) and 



white for processes that have equal levels of power at all 
fluctuation frequencies (white or flat spectra). 

There are a number of TN models that are distin- 
guished by the time evolution of the residuals, or equiv- 
alently the shape of the power spectrum of the residuals. 

One set of models is based on random walks of the 
spin properties of the pulsar. We consider random walks 
in pulse phase (RWo), frequency (RWi), and frequency 
derivative (RW2) as useful archetypal processes. Because 
the processes correspond to white noise in v, v, and v, 
they produce rms residuals that scale proportional t o 
T 1 / 2 , T 3 / 2 , and T 5 / 2 , respectively (jBovnton et al.l[l972l ) . 
and ensemble average power spectra wi th spectral in- 
dices of —2, —4, and —6, respectively (Harding et al. 
1990). For these processes the residuals have non- 
stationary statistics. For reference, we note that a grav- 
itational wave background from merging massive black 
holes produces rms residuals that scale proportional to 
T 5 / 3 and a power spectrum with a spectral index of 
-13/3 (j.Taffe fc Backenl27)03h . 

Band-limited noise (BL) is associated with processes 
that have low and high frequency cut-offs, in which the 
rms residuals increase for some time with a slope depen- 
dent on the particular spectral shape of the process. Af- 
ter a time associated with the low- frequency cut-off of the 
band T out = l//w> the rms timing noise will plateau. 
An example of a BL process is the perturbation induced 
by a wide asteroid belt around a pulsar (R. Shannon et 
al., in preparation). 

In many pulsars it appears that multiple types of TN 
occur at once. However, random walks provide a good 
basis f or modeling non-s t ationa ry co mponents of timing 
noise. iCordes fc Downs! (|1985[ ) and iD'Alcssandr o et al.1 
(|1995f) conducted detailed analyses of complementary 
sets of canonical pulsars. While they found that TN 
in most canonical pulsars cannot be explained by a sin- 
gle random walk process, both suggest that a mixture of 
random walks in v and v and discrete jumps in 0, i/, and 
v were compatible with the TN. 

Alternative models for timing noise include periodic 
and quasiperiodic processes. These models have gained 
favor because of recent reports of periodic and quasiperi- 
odic contributions the the residual TOAs for a few pul- 
sars. For example, PSR B1931+24 (|Kramer et alJl2006f > 
shows jumps between two states with distinct spin down 
rates v at quasip e riodic tim es. In a study of 366 pul- 
sars, iHobbs et all (|2010Q and iLvne et al.l (|2010D identify 
a few pulsars (ss 2% of their sample) that contain peri- 
odic or quasiperiodic components in residual time series 
and switches between distinct states of v. They also 
found that the different levels of v have unique average 
pulse profiles and they propose that this form of tim- 
ing noise can be corrected. In a substantial fraction of 
the identified cases of periodicity or quasiperiodicity, the 
model included a significant v that is attributed to non- 
stationary timing noise that augments any periodic or 
quasiperiodic component. We discuss the possibility of 
mitigating timing noise further in £15.31 

While in some cases, there is clear evidence of a pe- 
riodic or quasiperiodic contribution to the TOAs, in 
other cases, realization to realization variation can mimic 
quasiperiodic behavior. To demonstrate this we simu- 
lated residual curves for RWo, RWi, and RW2 random 
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walks. In the top panel of Figure [TJ we show four re- 
alizations of quadratic-subtracted residual TOAs for the 
same RWi process. In the plots residual curves show be- 
havior that mimic quasiperiodicity, irregular behavior in 
which higher order polynomials dominate the TN, and 
cubic-dominated behavior with both v > and v < 0. 

We use the number of zero crossings to quantify the 
morphological variations in single realizations of RW pro- 
cesses. Realizations that have a large number of zero 
crossings will appear quasiperiodic or irregular. In con- 
trast, realizations that have three zero crossings will ap- 
pear cubic and match what is expected from ensemble 
average statistics. In the bottom panel of Figure Q] we 
show a histogram of the number of zero crossings for each 
quadratic-subtracted polynomial for 4000 realizations of 
RWi, and RW2 processes. A significant fraction of the 
realizations of both RWi and RW2 processes show > 3 
zero crossings and a few show > 6 zero crossings. The 
number of zero crossings for residuals of RWo random 
walks is not displayed, because this random walk has a 
shallow spectrum and thus single realizations show very 
irregular behavior that typically have > 10 zero cross- 
ings. 

3. TIMING NOISE: DIAGNOSTICS 

Two approaches have been used to characterize the 
strength of timing noise in radio pulsars. The first 
uses the total TN aft er a second order fit cttn,2- 
ICordes fc HelfancH (|1980l) define the activity parameter 
as 



A = log 



CTN,2 (? 1 )crab 



(3) 



which measures levels of TN relative to the Crab pul- 
sar and represents a time-independent measure of the 
strength of the TN, assuming that pulsars show TN with 
similar time variability to the Crab pulsar. 

A second set of methods characterizes timing noise us- 
ing the frequency second derivative i> calculated from 
a cubic fit to the TOAs. Some groups have directly 
used v to assess the s trength of the TN (jUrama et al.1 
120061: lChukwude|l2007l) and correlated it w ith other pul- 
sar parameters. lArzoumanian et al.l (|1994f ) assessed the 
strength of TN using a parameter 



(4) 



A 8 = log [ i^Tj 



where v is measured over an observing span of T§ = 10 8 s. 
While the cubic term will dominate the variance of TN 
in the ensemble average of any red process with a mono- 
tonically decaying spectrum, in a single realization higher 
order terms may contain a large portion of the TN. Thus 
statistics based on v tend to underestimate the amount 
of TN in these processes. Additionally, the statistic Ag 
is model-dependent because v on average increases with 
length of observing span for red noise processes, much 
like the total rms residuals. Therefore to properly com- 
pare values of Ag or v in observations of different lengths 
a model-dependent time scaling needs to be included. 

A dime nsionless Allan v a riance -like parameter er 2 is de- 
scribed in Matsaki s et al.l ([1997) that can be used to es- 
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Figure 1. Upper panel: Four realizations of RWi timing noise. 
Curve a has a large number of zero crossings and behavior that 
could be misidentificd as quasi-periodic in spectral analysis. Curves 
b shows behavior that is irregular. Curves c and d show behavior 
in which the cubic term is dominant, with u > in curve c and 
v < in curve d. Bottom panel: Histogram of the number of zero 
crossings for processes RWi (thick lines) and RW2 (thin lines) after 
including a quadratic fit. Both processes show realizations where 
the number of zero crossings is much larger than 3 that mimic 
quasiperiodicity in spectral analyses. The number of zero crossings 
for RWo is off the scale of the graph. 



timatc pulsar stability, 



2V5 



(5) 



where <7i> (T) is the rms of v over observing spans of length 
T . Because the parameter uses v to estimate TN, it also 
will in general underestimate the total TN. 

Previous methods do not provide satisfactory diagnos- 
tics for TN. We therefore suggest that the rms timing 
noise (after a second order fit) is the basis for any proper 
diagnostic of TN, and propose two closely-related tools 
for diagnosing TN in pulsars. 

To estimate the timing stability of a pulsar we use the 
post-fit rms TN scaled to v, v, and time span T, 

<ttn,2 = C 2 v a \vfT\ (6) 
where the parameters C2, a, /3, and 7 are estimated over 
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the entire pulsar population. 

A diagnostic suitable for comparing timing noise across 
the pulsar population is the relative TN parameter 



c 



(7) 



which is the measured TN <7xN2) normalized by the 
global fit <ttn,2 from equation ©. The relative TN pa- 
rameter is similar to the activity parameter A, but in- 
stead of normalizing to the properties of one pulsar (i.e., 
the Crab pulsar), the TN is compared to the best fit 
across all objects. This statistic can be used to identify 
outlying objects. If a pulsar shows £ <C 1 it has smaller 
levels of TN than expected. If a pulsar shows £ ^> 1, it 
produces larger levels of TN than expected. We note that 
because this parameter depends on the modeled timing 
noise it depends on the observations included in the fit. 
The parameter values will change when more objects are 
included in the fit, or objects are included over longer ob- 
serving spans. As a result £ will change when additional 
observation of TN are included. If the new observations 
have statistically similar behavior as the present observa- 
tions of TN, the fit will not change in a significant way. If 
the additional objects have different behavior (for exam- 
ple, if timing noise became stationary over very large T), 
the revised values of C will better identify the outlying 
objects. 

4. TIMING NOISE ACROSS NEUTRON STAR 
POPULATIONS 

In this section, we show how rotational TN varies 
across the canonical pulsar, millisecond pulsar and mag- 
netar populations. Previous analyses of TN have focused 
on canonical pulsars and fit for only a limited number of 
parameters using the statistical tools described in the 
previous section. Instead we will use <ttn,2 and C, which 
are defined in equations (j6|) and ([7]), respectively. 

For our analysis we compiled observations of TN from 
many sources in the literature. In Appendix[A]we present 
the observing campaigns that we use and describe our 
methods for calculating cttn,2- Our analysis includes 
Nt = 1213 time series, from approximately 450 distinct 
pulsars, which include Njj = 591 detections of TN and 
Ntjl = 622 upper limits. Our analysis excludes young 
objects that have measured frequency second derivatives 
v that are attributed to pulsar braking. Plots displaying 
the rms timing noise otn,2 versus v and i> are displayed 
in Figure [2] 

4.1. Maximum Likelihood Analysis 

We use a maximum likelihood approach following 
iDewey fc Cordesl ([1989) to find the best fit parameters 
for equation ([6]) in logarithmic space 

ln(<TTN, 2 ) = lnC 2 + aln(i/) + /31n |i>_ 15 | + 7 ln(T yr ){8) 

with <7tn,2 expressed in /is, v expressed in s _1 , z>_i5 
expressed in 10~ 15 s~ 2 , and T yr expressed in years. 

A fifth parameter 5 is incorporated in the analysis to 
account for the large scatter in the strength of the timing 
noise. This scatter is associated with both realization to 
realization variation and non-modeled parameters that 
are assumed to be independent of v and i>, such as neu- 
tron star mass and other physical elements of TN. 



We assume that <ttn,2 is log-normally distributed; 
therefore the probability density function (PDF) of mea- 
suring rms residuals <7TN,2,i is 



/<r T N( cr TN,: 



1 



V2^2 



exp 



ln((7TN,2,i/ °"TN,2,' 

2P 



ISO 



where (Ttn,2,i = &TN,2,i(C2, ot, /?, 7, ft) is the modeled red 
noise component. We define the probability P,; as the 
product of the PDF and the measurement error 

Pi = / CT TN( CT TN,;)A(lno- TNii ) = f aTN (g T N,i) AaTN '\ ^0) 

0TN,i 

which assumes that the measurement error is small rela- 
tive to 5, a situation that is confirmed below. 

We also incorporate upper limits from many observa- 
tions using the probability 



Pi 



UL,i 



1 



1 



-erfc 



ln((7TN,i/PTN,i) 

6V2 



(11) 



where erfc is the complementary error function. The 
total probability for No detections of timing noise and 
iVuL upper limits is then 



No JVul 

P(C 2 ,a,/3, 7 ,<5) = n^II P UL,. 



(12) 



For each population, the probability space was exam- 
ined using a series of grid searches. An initial search 
was conducted with a coarse grid and a wide range of 
values in each parameter to identify the best-fit location 
and determine if multiple values of any of the parame- 
ters were allowed. Refined grid searches were conducted 
with much narrower ranges in values with fine gridding 
to calculate parameter estimation error and covariance. 

4.2. Canonical Pulsars 

A fit restricted to only the canonical pulsars yields well 
determined values of the parameters in equation ([6]) . We 
find significant correlation of the strength of timing noise 
with v, i>, and T. The estimated parameter values and 
their respective ±2<r (95%) confidence intervals are pre- 
sented in Table [TJ The scaling of timing noise with ob- 
serving span (7 = 1.9 ± 0.2) is found to be intermediate 
to scalings expected from RWi and RW2, for which we 
would expect 7 = 3/2, and 7 = 5/2, respectively. 

Realization to realization variation associated with a 
stochastic process provides insufficient scatter account 
for the spread in timing noise that is characterized by the 
fit parameter S — 1.6 ±0.1. We simulated a large number 
of realizations of random walks RWo, RWi, and RW2 and 
determined that realization to realization variation will 
induce a scatter in each process of S = 0.23, 0.46, and 
0.60, respectively We conclude that the inferred value 
of S includes additional contributions from the actual 
TN processes that are not captured by single idealized 
random walk models. 

These findings generally agree with previous studies 
of timing noise in canonical pulsars that have concluded 
both that TN typically shows non-stationary behavior 
characterized by a red power spectrum and have estab- 
lished correlations between timing noise and other spin 
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Figure 2. Scatter plots showing the correlation between measured rms timing noise crxN,2 with spin frequency v (left panel) and frequency 
derivative, v (right panel). Filled symbols represent detections of timing noise, and open symbols represent la upper limits. Magnetars 
(y < 1/6 s — 1 ) are identified by stars, canonical pulsars (50 s~ 1 < v < 1/6 s — 1 ) are identified by circles, and millisecond pulsars (y > 20 s _1 ) 
are identified by squares. The observations encompass a wide range of observing spans 0.1 yr < T < 30 yr. 



parameters. iCordes fc Helfandl (|1980[ ) found a correla- 
tion between the activity parameter A and period deriva- 
tive P = —vjv in 50 pulsars. iDewev fc Cordesl (p8l 
found a correlation between this activity parameter and 
P and P in observations of 40 canonical pulsars. 

The scaling law models the timing noise over the entire 
range of observing spans and we find no evidence for 
band-limited timing noise. In Figure [3] we display the 
relative TN parameter £ versus observing span. If timing 
noise was band-limited over current observing spans, the 
amount of timing noise would plateau, and at large T, 
the fit would be poor and (C 1. In addition we found 
consistency between fits to CP observations with T < 
10 yr and T > 10 yr. 

Analysis of a large on-going timing campaign of 366 
pulsars at the Jodrell Bank Observatory with observ- 
ing sp ans of 10 to 36 years is presented in IHobbs et al.l 
(2010). They calculated a scaling relation between 
a z (10 yr) and v and v 

^(10yr) = 10- 11 -V-°- 4 | ! >_ 15 | - 8 , (13) 

where v is measured in s^ 1 and i>-\§ — 10~ 15 s~ 2 . Our 
scaling relationship cttn,2 oc v ~ -^ - 2 ^ 1 - ^ - 05 [ s incon- 
sistent with this. In their analysis, o~ z includes contribu- 
tions from additive white noise. If we conduct our analy- 
sis with (7,^.2 (i.e., include the white noise) instead of only 
to the red component <ttn,2, we find a more consistent 
scaling relationship of agg t 2 oc ^-0- 7 ± 01 | z >| - 76 ± - 02 . 

4.3. Millisecond Pulsars 

Only two MSPs have shown significant levels of timing 
noise: PSR B193 7+21 (discussed in d etail below), and 
PSR B1821-24. (IVerbiest et al J 120091) . The model for 
canonical pulsars over-predicts the level of timing noise 
observed in PSRs B1937+21 and B1821-24, as displayed 



I I I =1 

□ 



o 

o o 




10 20 30 40 

T (yr) 

Figure 3. Scatter plot showing the timing noise parameter £ 
versus observing span T. We have used the timing noise model 
of the joint CP+MSP fit to calculate There is no evidence 
for a change in timing noise characteristics over longer observing 
spans. Filled symbols represent detections of timing noise, and 
open symbols represent 2a upper limits. Magnetars (y < 1/6 s — 1 ) 
are identified by stars, canonical pulsars (50 s _1 < v < 1/6 s _1 ) 
are identified by circles, and millisecond pulsars (v > 20 s _1 ) arc 
identified by squares. The solid lines indicates f = 1. The dashed 
lines are the ±1<t variation of f, as inferred from value of <5 inferred 
from the joint CP+MSP fit. 

in Figure SI For both of these objects, the observed levels 
of TN are below the levels expected from the CP-only fit 
by a factor of one to two times 5. 
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The best fit to the MSP population, listed in Table [Q 
has larger fitting uncertainties because the few observa- 
tions of TN and constraining upper limits are restricted 
to smaller ranges in v, z>, and T. The fit is dominated by 
the many observations of timing noise in PSR B1937+21. 
For a few MSPs, observations provide restrictive upper 
limits, but for many the expected levels of timing noise 
are not constraining at the levels predicted by the CP- 
only fit. 

We also conducted a joint fit of the MSP and CP pop- 
ulations. In Figure |4] (right panel) we plot the observed 
levels of TN versus the levels predicted from the joint 
fit. Visual inspection suggests that this fit provides a 
good model of the timing noise in the MSP population 
because the levels of timing noise observed are within the 
±1<5 band and the upper limits exceed levels predicted by 
the model. 

The quality of fit is quantified by comparing observed 
and predicted levels of timing noise using a x statistic 

-2 _ V"^ (l ncr TN,2,i — hl(TTN,2,i) 2 

x - 2_j p ' i 14 - 1 

where only the No observations of detected timing noise 
are included and upper limits are excluded. If a model 
provides a good fit to the data, \ 2 follows a x 2 distribu- 
tion. For a fit to any population, the number of degrees 
of freedom is A^dof = No — 5 if the population included 
in the fit (because 5 parameters are included in the fit), 
and ./Vdof = No if the population was not included in 
the fit. In Table [21 we list the values of x 2 and corre- 
sponding probabilities P that each fit models the indi- 
vidual populations. This analysis confirms that the joint 
CP+MSP fit is a good model for both the CP and MSP 
populations. 

The similarity of timing noise in MSPs to that in 
canonical pulsars is strengthened by examining the tim- 
ing residuals of PSR B1937+21 in greater detail. In 
terms of statistical precision, PSR B1937+21 is the best 
MSP in which to study timing noise because it shows the 
largest levels of TN of any MSP. 

In order to assess the strength and type of timing noise 
in PSR B1937+21, we investigate how cttn,2 scales with 
observing span by combining the results of many tim- 
ing programs presented in Appendix [X] In Figure [5l the 
rms residual timing noise is plotted versus observing span 
length for the various campaigns. In this figure we also 
show model curves for random walks RWo, RWi, and 
RW2 scaled to an ensemble-average rms of 2 /is over an 
8 year observing span, combined in quadrature with a 
0.15 jj,s white noise component, which matches the lev- 
els of noise in the short time span observations displayed 
in Figure [5] Over short time spans, the residuals are 
dominated by white noise associated with instrumental 
sensitivity, pulse av eraging effects, and diffractive inter- 
stellar scintillation (jCordes et al.lfl990l) . 

Inspection of this plot shows that the scaling of <7tn,2 
with T is intermediate to RWi and RW2 and therefore 
the scaling of TN with time is consistent with the ob- 
served scaling in the CP population (cttn oc T 2±0 - 2 ). 
This scaling is inconsistent with RWi or RW2 random 
walks. We note however that the power law scaling is 
altered if the amplitude of the RW steps have a power- 
law distribution (for further discussion see Appendix C of 



iCordes fc Downslll985l ) . The level of TN does not plateau 
over large T so we conclude that the timing noise shows 
no sign of being band-limited on the current observation 
time scales. 

Observational bias has lead to detection of TN in only 
two MSPs. The expected levels of timing noise in the 
other MSPs are below current observing sensitivity. Tim- 
ing noise is observed in PSR B1937+21 because it has a 
considerably larger v than other intensely studied MSPs. 

4.4. Magnetars 

The probabilities that the models fit the observed TN 
in the magnetar population are displayed in Tabled We 
find that the magnetar-only model provides a good fit to 
the observations (not surprisingly) but all other models 
under-predict the timing noise in the magnetar popu- 
lation. We conclude that magnetars show timing noise 
levels in excess of those found in the other populations 
of neutron stars. 

4.5. Discussion: Timing Noise in Pulsar Populations 

There are physical reasons to expect timing noise in 
MSPs and CPs to be consistent and follow a combined 
power law. Magnetic fields almost certainly play a role 
in generating timing noise. If surface or magnetospheric 
magnetic fields play the dominant role (likely the case in 
the case of a magnetospheric origin of timing noise, and 
plausibly the case if TN in associated with crust-core 
interactions), then timing noise would depend on pulsar 
spin parameters. Differences in the relationship between 
magnetic field strengths and the spin parameters v and 
v between the populations would cause a break-down 
in the scaling relations. For example, if timing noise is 
associated solely with the neutron star core there may 
be a breakdown in the scaling relationship because both 
CPs and MSPs may well have similar internal magnetic 
field strengths despite different spin characteristics. 

Ultra-strong magnetic fields may cause the excess tim- 
ing noise observed in magnetars. Unlike canonical pul- 
sars and millisecond pulsars, magnetar radiation appears 
to be driven by the decay of magnetic fields, with some 
theories sugges ting that the radia tion is associated with 
crust cracking (jHurlev et alJl2005T ) that may be enhanced 
compared to CPs. This cracking could drive rotational 
irregularities that contribute to the observed excess in 
timing noise. 

Many radio pulsars have been discovered that have v 
and v approaching those of the magnetars. Additional 
timing observations of high magnetic field radio pulsars 
are needed to properly assess the difference between radio 
pulsars and magnetars. 

5. IMPLICATIONS FOR GRAVITATIONAL WAVE 
DETECTION 

The presence of timing noise will significantly affect 
the sensitivity of a pulsar timing array to gravitational 
radiation. At present, most MSPs show residuals consis- 
tent with white noise. Based on the scaling laws derived 
in 21 we predict that TN will be identified in many ob- 
jects when they are monitored over longer time spans or 
observed with higher precision. 

In Table [31 we list the MSPs that at present show the 
best timing stability. We show the expected rms timing 
noise over 2 yr, 5 yr, and 10 yr observing spans, based on 



Timing Noise in Pulsars 



7 




a TN _ 2 Os) a TN _ 2 (/xs) 

Figure 4. Correlation between predicted rms TN <ttn,2 and measured rms TN <ttn,2 for the CP-only model (left panel) and the joint 
CP+MSP model (right panel). Filled symbols represent detections of timing noise, and open symbols represent 2a upper limits. Magnetars 
(y < 1/6 s ) are identified by stars, canonical pulsars (50 s^ 1 < v < 1/6 s~ 1 ) are identified by circles, and millisecond pulsars (MSPs) 
are identified by squares. Points above this line are observations that have levels of timing noise greater than expected (£ > 1). Points 
below this line are observations that have levels of TN less than expected by the model (C < 1). The dashed lines show the expected width 
as estimated by the parameter <5, corresponding the ±lcr (67%) width. The dotted lines show ±2cr (95%) width. The CP-only model 
overestimates the strength of the timing noise in the MSPs, and both fits underestimate the level of timing noise in the magnetars. 



Table 1 

Best Fit Parameters 



Fit 


ln(C 2 ) 


OL 


P 


7 


<5 


N D (N VL ) 


CP 


2.0 ±0.4 


— 0.9 ± 0.2 


1.00 ±0.05 


1.9 ±0.2 


1.6 ± 0.1 


563 (470) 


MSP 


-20 ± 20 


1±2 


2±1 


2.4 ±0.6 


1.2 ±0.5 


12 (147) 


CP+MSP 


1.6 ±0.4 


-1.4 ±0.1 


1.1 ±0.1 


2.0 ±0.2 


1.6 ±0.1 


575 (617) 


MAG 


3±7 


-1±3 


1.5 ±0.6 


3±1 


2.1 ± 0.7 


15 (7) 


CP+MAG 


2.4 ±0.5 


-1.4 ±0.2 


1.13 ±0.07 


1.7 ±0.2 


1.7 ±0.2 


578 (477) 


ALL 


2.2 ±0.4 


-1.5 ±0.1 


1.2 ±0.1 


1.8 ±0.1 


1.7 ±0.1 


590 (624) 



Note. — Best fit parameters and ±2<r confidence limits for different populations of 
pulsars. Ne> is the number of time series with detected timing noise used in the fit. 
Atjl is the number of time series with upper limits of timing noise used in the fit. 



the scaling relationships discussed in SJD We also show 
the ±lcr variation that would be expected based on the 
observed spread of timing noise, and measured limits 
on the amount of TN over 10 yr observing spans. For 
these pulsars TN is likely present at the 10 ns to 100 ns 
level and will therefore affect the detection of other TOA 
perturbations with amplitudes at these levels, such as a 
gravitational wave background. In addition, we show 
predicted and measured levels of timing noise for PSR 
B1937±21. The large levels of timing noise imply this 
object will not contribute to the sensitivity of a PTA to 
GWs. 

In the following we examine in detail the effect of the 
presence of timing noise on the properties of the PTA. 

5.1. Timing Noise & PTA Sensitivity 

To estimate the level of timing stability required to de- 
tect GWs, we calculate the GW detection signal to noise 
ratio (SNR) using a particular detection scheme. We 



note the resulting conclusions are general and are rele- 
vant to all detection methods, including methods that 
are imple mented using either frequentist or Bayesian a p- 
proaches (jJenet et al.l l2005: va n Haasteren et aT1 l2009). 



5.1.1. Best Case: Gravitational Waves and Timing Noise 

Only 

In order to assess the best possible case, we first con- 
sider TOAs that contain only perturbations associated 
with gravitational waves and timing noise. For each pul- 
sar k at observation epoch i, the time of arrival pertur- 
bation (before any fit) Ski is altered by the correlated 
component of the GWB passing through the solar neigh- 
borhood eki, the uncorrelated component of the GWB 
outside of the solar neighborhood pki , and uncorrelated 
TN r kt : 



Ski 



e-ki + Pki + r kl . 



(15) 
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Table 2 

Fit Comparisons 



Model Fit 



CP MSP MAG CP+MSP CP+MAG ALL 



Family N D X 2 P X 2 P X 2 P X 2 P X 2 P X 2 P 

CP 563 532 0.8 10 52 (10" 20 ) 1996 (10" 20 ) 538 0.72 477 0.99 493 0.98 

MSP 12 37 10~ 3 - 6 14 0.05 38 10" 3 ' 8 3.8 0.8 3.7 0.99 7.0 0.4 

MAG 15 176 (10~ 20 ) 10 4 1 (10~ 20 ) 10 0.4 98 (lO -10 ) 63 lO" 7 - 1 54 10" 75 

Note. — Goodness of fit estimates for the canonical pulsars (CP), millisecond pulsars (MSP), and the magnetars (MAG) for 
models of sub-populations. For the model fits the No detected time series were used to calculate a x 2 statistic to assess the goodness 
of fit for the subgroups of the pulsar population. Using this statistic, we calculated the probability P that a fit modeled the observed 
levels of timing noise. Probabilities in parentheses are upper limits. 



Table 3 

Expected Levels Timing Noise for PTA Pulsars 











T = 2yr 






T = 5 yr 






T 


= 10 yr 




Object 


V 




<TTN 


<5"TN,L 


t?TN,(7 


(Ttn 


»TN,L 


0"TN,[7 




*TN,L 




"TN.meas 




(s- 1 ) 


(10- 15 s- 2 ) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


(ns) 


J0437-4715 


174 


-1.73 


35 


7 


180 


210 


41 


1100 


830 


160 


4300 


< 200 


J1713+0747 


219 


-0.41 


5 


1 


26 


31 


6 


160 


120 


23 


630 


< 200 


J1744-1134 


245 


-0.54 


6 


1 


31 


36 


7 


190 


140 


27 


730 


< 620 


J1909-3744 


339 


-1.62 


13 


2 


68 


79 


15 


410 


310 


60 


1600 


< 170 


B1937+21 


623 


-43.30 


230 


44 


1200 


1400 


270 


7200 


5500 


1100 




1500 



Note. — Estimated strength of timing noise for selected PTA pulsars and PSR B1937+21 over 2 yr, 5 yr, and 10 yr observing 
spans based on the best-fit model to the canonical pulsars and the millisecond pulsars (as defined in Table [TJ- For each pulsar 
we list the spin frequency u and spin frequency derivative i>. For each observing span we show the expected values 6tn an< ^ the 
la upper and lower limits: 0"tn,l and S"tn,(7> respectively. The limits are formally the quadrature sum of the parameter that 
quantifies the scatter of the distribution 8 and the estimation error associated with the model, but are dominated by 8. We also 
present the measured timing noise ffTN,meas (or upper limits) over fa 10 yr observing span. 



The perturbations e and p have the same rms strength. 
We define the SNR in the time series to be the ratio of the 
rms amplitudes (after a second order fit) of the correlated 
portion of the signal (i.e., e^) to the the uncorrelated 
portion of the signal (pki + r/w). Thus in the residuals 
from a single pulsar the SNR is at most unity and is 
smaller if TN is present. 

We now consider one approach to GW detection 
that involves forming a coherent sum (R. Shannon & 
J. Cordes, in preparation) of the residuals for ./Vpta pul- 
sars, which increases the SNR. The best case configura- 
tion is when all the pulsars are located in a small patch 
of the sky, but at different distances away from the ob- 
server. In this case is completely correlated between 
pulsars and pki and eui are uncorrelated. As a res ult, eki 
is amplified relative to pki and r^i by a factor V^YptA- 
The combined SNR in a single data block of span T of 
observations from iVpTA pulsars is 



where we have assumed the data set can be subdivided 
and M independent estimates of the TS can be calculated 
(for example by using data blocks of length Tm = T/M), 
resulting in an enhancement of the SNR by a factor of 
vM. We note that t here are alte r nativ e ways to subdi- 
vide the time series. Uenet etaLI (|2006t ) decompose the 
residuals using a set of orthonormal polyn omials and cal- 
culate a TS using each polynomial, while IVerbiest et all 
(2009) conduct an analysis in the Fourier transform do- 
main. In all cases the optimal value of M is limited by 
other sources of noise (like white noise) , which we discuss 
further in 35X21 

The scaling relationship of equation (|T7| is used to 
establish the properties of a PTA sufficient to detect the 
GWB. To detect the gravitational wave background with 
a strength ctgw,2(0 with SNRts,j\/ > ■S'min requires that 
the TN in an individual pulsar satisfy 



0tn,2(7m) < ctgw,2(7m)i 



' MNpTA 



TA 



'TN,2 



CO 



'GW,2 



CO' 



(16) 



2 

min 



1. 



(18) 



where the rms strengths of the GWB and the TN are 
characterized by <tgw,2(0 and <ttn,2(0, respectively. 
A test statistic based on the coherent sum has an SNR 

of 



The number of pulsars required to detect a GWB of 
strength <7gw,2(0 with an SNR greater than S m i n with 
TN at a level Wn,2(0 is 



o2 

at w min 

M 



1 



/ OTN,2(?m) 
\ cr GW,2(r / 



M 



(19) 



TS,Af 




MJVi 



PTA 



CT TN,2( r Af)/CT GW 2 (T M ) 



(17) 



Here we make two preliminary estimates of the require- 
ments for GW detection using equations (fl8|) and ([i"9|) . 
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T (yr) 

Figure 5. Plot of the rms residuals <t^2 = ^/°tn,2 ^~ °w versus 
observing span T for PSR B1937+21 and simulated random walks. 
The large scatter in the observations at T = 1 yr is associated with 
variable levels of white noise across timing programs. The expected 
variation for random walks in phase (RWo, thinnest lines), fre- 
quency v (RWi, medium thickness lines), and frequency derivative 
v (RW2, thickest lines) are also displayed. The 95% confidence 
limits (based on simulations of a large number of realizations) are 
shown for each process. The strength of the random walks are nor- 
malized to cttim^ = 2 /xs at T = 8 yr, which is indicated by the 
cross on the plot. To each curve, white noise with rms strength of 
ayv = 0.15 fis is added. This level is denoted by the horizontal line 
marked WN. 

In £15.21 we give a more detailed assessment that uses the 
model of TN in the pulsar population presented in §4[ 

As a first example, we estimate pulsar stability require- 
ments to detect the expected stochastic background of 
merging massive black hole (MBH) binaries. Stochas- 
tic GWBs are typically characterized by their expected 
strain response h c (f) and not <tgw.2- In Appendix [Cl 
we show how to calculate <tgw,2 from h c (f). The MBH 
background is presently considered the strongest plausi- 
ble GWB, and is expected to induce a strain response of 
h c (f) = Ao(f/l yr -1 ) -2 / 3 , where the va lue of A is es- 
timated to be between 10~ 16 a nd 10~ 15 (jJaffe fc Backer! 
[20031: iSesana fc Vecchiol [20Tol) . Over a T = 5 yr ob- 
serving span, the MBH GWB will contribute ctgw,2 = 
19 ns (Ao/10 -15 ) to the times of arrival, as indicated 
in Table [2J which presents results for this section and 
for £ 35.21 To achieve a signal to noise ratio in the de- 
tection statistic of 5 m j n = 5 for a PTA comprising 
ATpTA = 40 pulsars using M = 1 observation blocks, 
timing noise must be limited to cttn.2(T' = 5 yr) < 

(v / 375)<7 G w,2v 71 = 5 yr) ps 15 ns. 

We can also estimate the number of pulsars required 
to detect a GWB if TN levels are equal to the amplitude 
°tn,2 « 20 ns over 5 years exhibited by the pulsars in 
Table [31 A PTA comprising iVpxA = 70 pulsars would 
yield an SNR of S mnl — 5, assuming M = 1, and a 
GWB with the same properties as in the previous exam- 
ple. However, a number of MSPs are expected to have 



TN at levels below the scaling law and therefore the re- 
quired number of pulsars may be somewhat lower, which 
is described in £15.21 

5.1.2. Effect of White Noise on the Number of Independent 

Sub-blocks 

In general, the number of subdivisions M that max- 
imizes the SNR of the TS depends on the amplitude 
of other noise contributions to the residuals, in partic- 
ular white noise, which is guaranteed to be present in 
pulsar timing observations. We define a time scale Tm 
over which the expected GW signal exceeds the white 
noise levels (Tw,ts in the coherent time series by the same 
threshold as the TN, i.e., o- GW . 2 (T M ) = S min a w ,Ts{TM)- 
For a total observing span of length T there are M « 
T/Tm independent data blocks if T > Tm', if not, there 
is M = 1 data block. 

The random noise in the TS associated with the WN 
is given by 

ws(r)= ^ra- (20) 

where a n is the level of white noise in a single obser- 
vation and N b s (T) = R bsT is the number of obser- 
vation epochs in the interval of length T, and is char- 
acterized by an observation rate i? bs (or equivalently 
an observing cadence of R~ bs )- For a background with 
cr GW,2(2 n ) = &gT 5 / 3 , the minimum time is 

/ c \ 6 / 13 

T M =(^— p™— ) . (21) 

\<Jg y/NpTARobsJ 

For an array of 40 pulsars, with N b s {T) = 10 T yr 
(i.e., 10 observations per year), o~ n — 100 ns rms error 
per residual, and a GWB with strain spectrum h c (f) = 
10~ 15 (//1 yr -1 ) -2 / 3 , the minimum block size is T min 
2 yr for ./Vpta = 40 to 100. The minimum block length 
T m i n is approximately the same for both values of ./Vpta 
because of the weak dependence of M on A^pta, M oc 

3 /13 

iV pTA , for the assumed GWB background. 

5.2. The Fraction of MSPs Suitable for PTAs 

Using the stability requirements defined in equation 
(fTS)) , the fraction of MSPs suitable for inclusion in a PTA, 
^"iviSPi can be evaluated. This fraction is equivalent to 
the probability that a pulsar within the population has 
rms timing noise less than some threshold amount at over 
an observing span of length T. Based on our TN model, 
this probability is 

/lntTt r- 
din a dMp M {M) 
-co J 

x J dvdi>p v ,i,(v,i')pi nr j(\ncT\M,v,i',T), (22) 

where pm(M) is the PDF of the parameter distribution, 
p v .i, is the PDF of the pulsar distribution in v and v, and 
Pi ncr is the PDF of the level of timing noise, given the 
model parameters. 

In Appendix [B] we present methods for evaluating 
equation ([22)) . The sensitivities of PTAs comprising 
N p = 40 and 100 pulsars with a variety of observing 
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Table 4 

Timing Noise Constraints on MSPs Suitable for a PTA 









ApTA = 


40 




ApTA = 


100 


T 


CGW,2 


M 


fTN,2,i 


^MSP 


M 


"TN,2,( 


^MSP 


(yr) 


(ns) 




(ns) 


(%) 




(ns) 


(%) 


2 


4 


1 


3 


30 ± 7 


1 


7 


46 ± 8 


5 


19 


1 


14 


25 ± 6 


1 


32 


40 ±7 


10 


59 


2 


28 


37 ±7 


3 


31 


56 ±7 


20 


187 


5 


34 


50 ±7 


6 


45 


64 ±7 


5 


19 


2 


9 


41 ±7 


2 


15 


53 ± 7 


5 


19 


4 


4 


55 ± 8 


4 


7 


65 ± 7 



Note. — Fraction of pulsars suitable for inclusion in a PTA, 
^MSP> f° r arrays comprising Apta = 40 and 100 pulsars, over 
observing spans ranging from T = 2 yr to T = 20 yr. Wc 
have assumed a background with characteristic strain spectrum of 
hc(f) = 10 _15 (//1 yr -1 )~ 2 / 3 . We have also assumed a detec- 
tion signal to noise ratio of S m i n = 5, using equation 1181 . with 
M independent data blocks, as described in the main text. Also 
listed are the quadratic-corrected rms contribution of the gravita- 
tional wave background ffGW^fl 1 ) for the total observing span T 
and the threshold TN level <TxN,2,t(TAf ) for the sub-block length 
T M = T/M. 

spans T are investigated with the same conditions as in 
§5.11 We have modeled timing noise <7tn,2 using the joint 
CP and MSP model as presented in Tabled] and equation 

In Table 0] we show the fraction of pulsars suitable 
for inclusion in these PTAs. The fraction of suitable 
pulsars is smaller at T = 5 yr than T = 2 yr because 
the level of expected timing noise has increased relative 
to the GW signal but the number of sub-blocks M has 
not changed. To produce a PTA comprising -/Vpta pul- 
sars requires the investigation of a total sample of MSPs, 
Nmsp = -^pta/^msp pulsars. To calculate M we have 
again assumed that the pulsars are observed 10 times per 
year with an rms precision on a single TOA of 100 ns. 
For a PTA comprising 40 to 100 high quality MSPs and 
data spans of 5 to 10 years, our analysis indicates that a 
total MSP sample that is two to three times larger than 
ApTA needs to be investigated in order to identify high 
quality objects. 

5.3. Mitigating Timing Noise 

It has been previously noted that applying a low-pass 
spectral filter to the residual TOA time se ries can im- 
prove the signal-to-noise ratio in a PTA (jJenet et al.l 
2005). The presence and diversity of red noise in the 
MSPs necessitates filtering that is tailored to the prop- 
erties of each individual pulsar. Schematic power spectra 
for RWo, RWi, and RW2 random walks and the gravi- 
tational wave background are displayed in Figure El For 
systems in which RW2 is the dominant form of TN high- 
pass filtering (i.e, removing the lowest frequency compo- 
nents of the signal) can be used to mitigate the contri- 
bution of TN to the TOAs. It is not possible to develop 
a filter that mitigates RWi timing noise without also re- 
moving the gravitational wave signal because they have 
very similar spectral shapes. 

There is evidence that pulse profile information can be 
used to correct residual time series. iLvne et al.l {2010) 
identify a link between changes in pulse shape (proba- 
bly connected to mode changing) and changes in i>, and 
demonstrate that some timing noise can be corrected by 
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Figure 6. Schematic power spectra for stochastic processes that 
contribute to pulsar timing residuals. The dashed lines indicate 
fluctuation frequencies at which the TN exceeds the GWB for var- 
ious processes and the arrows associated with these lines identify 
region in which the GWB signal is accessible. Vertical dashed line 
A identifies the region in which the spin frequency noise (RW2) 
exceeds the gravitational wave background. Vertical dashed line B 
identifies the region in which phase noise (RWo) exceeds the grav- 
itational wave background. Vertical dashed line C identifies the 
region in which white noise (WN) exceeds the gravitational wave 
background. 

identifying the time of the mode changes and the esti- 
mated values of v. Even if all TN is associated with 
mode changing it is necessary to observe with a short 
cadence to accur ately determine th e time at which the 
change occurred. ILvne et ahl (|2010l ) argue that daily ob- 
servations would be necessary to adequately mitigate this 
mode changing timing noise in their objects. If objects 
are not observed with a sufficiently short cadence, the 
uncertainty in the time of mode change introduces resid- 
ual timing noise in the corrected time series with an am- 
plitude proportional to the uncorrected level of timing 
noise. As noted i n ^31 in half of the cases presented in 
ILvne et all (|2010l ). the correlation was found after a fit 
for v that removes non-stationary TN could not be cor- 
rected in their data. 

5.4. Future Prospects 

A large number of pulsars need to be studied because 
timing noise will limit the utility of many objects and 
MSP timing stability cannot be fully constrained with 
spin parameters v and v. As a result, if Apta pulsars 
are required for a significant detection of the GWB, a 
much larger n umb er of pulsars (Amsp = Apta/^msp, 
as outlined in £|5 . 2[) need to be discovered and character- 
ized. To constrain the timing stability it is necessary to 
conduct timing observations with sufficient precision to 
detect the presence of TN at the threshold level, as set 
by equation (|18p. For realistic PTA configurations and 
a reasonable detection SNR, the required stability level 
will be at most a factor of a few greater than the an- 
ticipated strength of the gravitational wave background. 
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An object is suitable for inclusion into the PTA if timing 
noise is not detected at this level. 

Pulsar timing arrays can be expanded by both incorpo- 
rating presently known objects with good intrinsic sta- 
bility that are currently excluded due to low flux and 
discovering new MSPs with suitable timing stability. 

Additional MSPs suitable for incorporation into a PTA 
are continually being discovered, with ongoing surveys 
with the Arecibo, Green Bank, Effelsberg, Parkes Tele- 
scopes; targeted searches for radio pulsar companions 
to Fermi gamma-ray point so urces; and in the near fu- 
ture with the LOFAR Array (jvan Leeuwen fc Stappersl 
l2010f ) . While occasionall y bright MSPs have been discov- 
ered (|Jacobv et al.ll2003f) . selection effects generally bias 
new discoveries toward fainter pulsars, and thus suitable 
objects require longer observations with more sensitive 
telescopes to mitigate white radiometer noise. 

The requirements for finding and timing ultra faint 
MSPs highlight the need to use high-gain telescopes such 
as the Arecibo telescope and the proposed Square Kilo- 
metre Array (SKA ). The SKA is estimated to find up 
to « 6000 MSPs (Sm its et al.l l2009f ) . If we conserva- 
tively estimate that 10% of the MSPs are suitable, there 
will be w 600 objects available for inclusion in a PTA. 
The very best of these could comprise a PTA sufficient 
to detect the GWB while a much larger PTA could be 
used to study in detail the GWB and detect and exam- 
ine individual GW sources. Large interferometers such 
as the SKA will be particularly important for improving 
throughput of timing campaigns because they can be di- 



vided into sub-arrays that can observe multiple objects 
simultaneously. 

6. CONCLUSIONS 

We have developed scaling relationships for timing 
noise in millisecond pulsars, canonical pulsars, and mag- 
netars. We find that timing noise in MSPs is consistent 
with that observed in canonical pulsars. The timing be- 
havior of the millisecond pulsar B1937+21 supports uni- 
versality of TN in CPs and MSPs. Latent timing noise is 
predicted to be present in other MSPs with similar prop- 
erties (but smaller) magnitudes to that in the CPs and 
PSR B1937+21, in accord with their smaller spin down 
rates. This timing noise may be measurable in many 
pulsars when either longer data sets or higher precision 
arrival times are obtained. Timing noise in magnetars is 
greater than that expected from extrapolation from the 
canonical pulsars. 
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APPENDIX 

A. REDUCTION PROCEDURE & TIMING CAMPAIGNS USED 

We have synthesized the results of many timing observations to construct the scaling relationships for timing noise 
(TN) in canonical pulsars (CPs), millisecond pulsars (MSPs), and magnetars (MAGs), as described in SJU and to 
conduct the case study of the MSP B1937+21 (as described in !j4.3[) . In Table [5] we summarize the timing campaigns 
used in the analyses, the average length of data span contained in the campaign, and the number and type of objects 
analyzed. 

In the following sections we outline the procedure used to properly combine the results from all of the campaign s. In 
§A.ll we describe how the root mean square (rms) timing noise cttn,2 is calculated from o ther TN diagnostics. In flA.2l 
we justify the threshold used for detecting the presence of red noise in timing data. In ^A.3l we justify the exclusion 
of some timing observations from this study. In ^ A.4t we describe the observations used that form the basis of our 
study of PSR B1937+21. 

A.l. Calculating otn,2 

We use the rms timing noise after a second order polynomial fit o'tn,2(7 1 ) as the primary observable property of TN, 
as justified in SJH 

In Table El for each reference we identify the type of observations reported. While some timing campaigns report 
otn,2 (coded TN in Table[S|), others report different but related measurements of TN. There are two notable conversions 
that are occasionally needed. Some campaigns report only the total rms residuals and the white noise level, and the 
red timing noise must be extracted from these quantities. For other campaigns, the timing noise is modeled in a 
functional form. 

Calculating <ttn,2 from the total timing noise and the white noise ( Code TW in Table [5|): Many timing campaigns 
report the total rms residuals and the levels of white noise in the observations. In this case, the amount of timing 
noise is the quadrature difference between the rms residuals o~gg t 2 and the white noise in the time series: 

CT TN,2 = °%,2 ~ a w- ( Al ) 

In these observations the the level of white noise aw reported comes from one of two sources: either the rms of residuals 
after the TN has been analytically modeled; or an estimate from the white noise in a single TOA. 

Including modeled timing noise, ( Codes S, STW, or H in Table\$jj: In some cases the fit includes terms A^tn(^) that 
model the timing noise. The model is typically a series of polynomials or sinusoids. It is typically included to provide 
an estimate of v (in which case M(t) — i)(t — T e ) 3 /6, where T e is the epoch at which the spin properties are defined) 
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or to improve the determination of modeled parameters of interest such as astrometric terms. In these cases <7tn,2 is 
approximated as the quadrature sum of the rms of -Mtn(0 and the TN contained in the post-fit residuals: 

°TS,2 = °TS,M( T ) + 7ff t M 2 TN (t)dt, (A2) 

where tg and t\ = to + T are the starting and ending epochs of the observations. 

For observations labeled S, polynomials have been used to model the timing noise. For observations labeled H, 
harmonically related sinusoids have been used to model the timing noise. For observations labeled STW, a fit including 
v was completed that partially whitens the residuals. In addition to i), the total rms timing noise otot.a-i and the 
whitened rms timing noise aw were reported. In this case, the rms timing noise is 0*tn .m(^) — ^"tot m — ^W' 

A. 2. The Detection Threshold for otn,2 

To determine if timing noise is detected in a time series, we conservatively require that the rms timing noise exceed 
twice the white noise floor (i.e., cttn.2 > ^w), because we suspect that in many timing programs the residuals were 
not examined at sufficient detail to rule out TN below this level. If the measured TN in a time series does not meet 
the threshold we declare the time series to be an upper limit with a value of 2o~w- This is much larger than the formal 
detection threshold, cttn,2 > o~w I VAdof, where -/Vdof is the number of degrees of freedom in the residual TOAs. 

A. 3. Excluded Observations 

We excluded observations of globular cluster pulsars which show acceleration (and significant v) associated with 
these dense enviro nments. We have a lso excluded some additional reports of timing noise from this analysis: 
PS R J1012+5307. ILanee eTaLl (l200lh report a non-zero v that they attribute to TN. However a more recent analysis 
by lLazaridis et al. | (l2009f) that inc ludes the previous data shows no evidence for i) ^ 0. We therefore omit the 
measurement of lLange et al.l (120011) . 

PSR J 17 13+ 07 17. ISnlaverl (120041) report a non-zero v that they attribute to TN. However, a more recent a nalysis by 
iVerbiest et al.l (|2009f) shows no evidence for v ^ 0. We therefore exclude the measurement of iSplaverl (|2004l ). 
PSR B1937+21. We have also excluded a measurement of TN for PSR B1937+21 from this analysis, which is discussed 
in the next section. 



A.4. PSR B1937+21 

The observations used for the analysis of the scaling of the rms TN for PSR B1937+21 (discussed in JO]) are 
presented in Table El We report the observing span of the observations the rms residuals a.^,2, and, when available, 
the number of TOAs used in the analysis. In order to i ncrease the numbe r of independent observations at short 
observing spans, the publicly available residual TOAs from Ka spi et al.l (|1994D were subdivided into shorter observing 
spans of 1, 2, and 4 years. We note that many of the observations contain contemporaneous or common observations, 
and therefore many of the data points are not formally independent. 

All the campaigns included have been corrected for dispersion measure (DM) variations, determined by measuring 
the arrival time difference contemporaneously in two frequency bands. The DM correction is more accurate in recent 
campaigns because of improved observation procedures. In early campaigns, the measurements at two frequencies were 
performed many days apart and changes in interstellar propagation over those times likely increase TOA uncertainty. 
In more recent campaigns, two-frequency observations often occur consecutively during the same observing session or 
simultaneously with dual-frequency receivers. 

We have excluded the 12.5 yr measurements of timing noise in PSR B1937+21 reported in IVerbiest eTaH (f2009h 
because the time series contained a long gap between observations with two different instruments. An arbitrary time 
offset between the two instruments (i.e., a jump) was included the fit. This jump removes a significant amount of TN 
from the residual time series. 

B. ESTIMATING THE FRACTION OF SUITABLE PULSARS 

In this section, we describe the methods for calculating the fraction of pulsars suitable for inclusion in the pulsar 
timing array, J^msp- The fraction of pulsars that show TN below a threshold RMS cttn,* is equivalent to the probability 
of finding a pulsar within the population with those properties, 

/In a t p p 

diner / dMp M (M) / dvdvp v ,i,{v,v)pi nlT {\na\M,v,v : T), (Bl) 

where pm is the probability density for observing fit parameters, where M = {C\, a, /3, 7, 6), as in equation ©; p V y 
is the probability density for the pulsar spin distribution; and pi na - is the PDF for observing a value of TN, assuming 
fixed values for the fit parameters. 
We will assume that the level of TN a is log-normally distributed about the expected value: 



pi n(J (lncr|Ci,a,/3,7,(5, v, z>, T) = y== ex P 



(In a/a) 5 



2S 2 



(B2) 
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This is consistent with the analysis of and the large observed spread in TN for pulsars with similar spin parameters 
v and v. 

To model the p u ,o, we will use the observed distribution of MSPs: 

N p 

PvAviv) = - v p)S{v - Vp)- (B3) 

p P 

For the analysis presen ted here, we use the 64 non-globular cluster MSPs listed in the ATNF pulsar catalogue 
(Ma nchester et alJl2005t l. 

The parameter space PDF pm is modeled using estimates of the best fit values and the fitting covariance matrix C 

PM (M\lnC 1 ,a,p, 7 ,6 T ) = -f f =^ = = exp \(M — M) T C~ 1 (M — M)l , (B4) 

^/(27r) t> det(C7 _i ) L J 

where M = (In C\, a, (3, 7, 5) T and M is a vector containing the best fit parameters to the joint CP+MSP fit. For ease 
of computation, the PDF was approximated using a large number N s — 10 5 of parameter values drawn from equation 
(B4}: 

1 Ns 

Pm = W J2 S(ln d - In C hs )6(a - a s )6(P ~ 0.)5(>y - j s )S(6 - S s ). (B5) 

To calculate P, equations (|B3|) and (IB5|) were substituted into equation (|Bip . 

To calculate the estimation error in P associated with the fitting error in model M, we analyzed the distribution of 
Pi using single realizations of the parameters to calculate pm, i-e., we substituted 

PM ,i = SQxlCx - ]nC lti )5(a ~ a z )5(p - ft)<5( 7 - ^)S{6 - Si) (B6) 

into equation (|B1|) to calculate a number of realizations of the probability Pi. The standard deviation of P, is the 
estimation error. 

C. STRENGTH OF THE GRAVITATIONAL WAVE BACKGROUND 

In this section, we calculate the rms strength of the gravitational wave background in the residuals ctgw,2(? 1 ) for a 
strain amplitude h c (f). The former quantity is the strength of the GW signal accessible to pulsar timing observations 
and is used to estimate the sensitivity of a PTA in <|5l 

The strain amplitude is usually modeled with power-law behavior over the range of / relevant to pulsar timing 
observations, 

h c (f) = A (P\ , (CI) 



and is characterized by an amplitude Aq at frequency /q. 

The power spectrum P r (f) of the TOA fluctuations is related to the the strain amplitude h c (f) by (jHobbs et al.l 
20093) 



The rms of the residuals ctqw 2{T) over a time span T is related to the power spectrum of the perturbations P r {f) 
by 



4w, 2 (T)=/ dfH(f,T)P r (f), 
Jo 



(C3) 



where H (/, T) is a high-pass filter that accounts for power that is removed by model fitting to the arrival times. 

The rms of the residuals <tgw,2 is most accurately determined by simulating the TOA perturbations associated with 
a GWB and then calculating the residual TOAs and ctgw.2- For a gravitational wave background with a = —2/3 

ji \ 5/3 



vgwAT) w 1.3 ns Ao,-x6 i^— J > ( C4 ) 

where Aq,_i5 = Ao/10™ 15 is the characteristic strain at /o = 1 yr _1 and that the high pass filter is approximately 
unity for / > 1/T and zero for / < 1/T. We note the scaling with T is similar to that for a random walk in frequency 
(RWi) for which cttn,2 oc T 3 / 2 is expected. 

The GWB was simulated using a large number of gravitational waves with wave amplitude, frequency, phase, 
polarization, and propagating direction randomly selected from appropriate PDFs. In particular, the wave amplitude 
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and frequency were selected from distributions consistent with Equation (|C1|) using appropriate lower fe and upper 
frequency cut-offs. Equation (|C4[) is valid for all ft <C 1/T. In simulations with = 1/(10T) we find that cgw 2(T) « 
1/25 a GW (T). 
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Table 5 

Previous Timing Campaigns 



Reference Objects 


T typ 


Observation 




(yr) 


Type 


Canonical Pulsars (CPs) 



Hclfand ct al. (1980) 
Cordes fc r^nTTigSS) 
D'Alessandro et al" Tl993) 
Arzoumanian ct al. 
Foster et al. (1994) 
D'Alessandro et al. 
Hobbs et al. (2004) 
Zou et al. 72004) 
Champion et al. (2005) 
Kaplan & van Kerkwjjk 
Chukwude (2007) 
McLaughlin et al. (2009' 
Hobbs ct al. (2010) 



37 
27 
45 
96 
1 

45 

346, 27 MSP 
2 

15, 1 MSP 
1 

27 

6 RRAT 
346, 30 MSP 



4 
10 
4 
3 
6 
7 
20 
1 
2 
11 
10 
6 
25 



TW 
TN 
TW 
TW 

T* 
TN 
TSW 

T 

T 

T 

S 

T 
TSW 



Kaspi ct al. (1994) 
Bell et al. Tl997) 
Lommen (2002) 
Hotan et aT7j006) 
Ord et al.720*06) 
Demorest (2007) 
Verbiest et al. (20081 
Hobbs et aTT 2009b) 
Lazaridis ct al. (2009) 
Verbiest (2009) 



Millisecond Pulsars (MSPs 



2 
4 

4, 2 

15 

1 

15 
1 

20 
1 

19 



CP 



2 
3 

10 
2 
1 
2 

10 
4 

14 

10 



S 
T 
S 
S 
S 
T 
T 
T 
T 
H 



Woods ct al. (2000) 
Kaspi et al. (2001) 
Gavriil K aspi (2002) 
Gotthelf et al. (2002) 
Camilo et al. (20077 
den Hartoe et al. (2008 ) 
Dib et al. (2008) 



Magnctars (MAGs) 



Note. — Timing campaigns used in this analysis. We 
list campaigns, class of objects studied, typical observing 
length Ttyp, and reported observable. Horizontal lines di- 
vide campaigns that study predominantly canonical pulsars 
(CPs, 1/6 s _1 < v < 50 s _1 ), millisecond pulsars (MSPs; 
v > 50 s" 1 ), and magnetars (MAGs, u < 1/6 s). The 
rotating radio transients (RRATs) are rotating neutron 
stars that show spin properties similar to that of canon- 
ical pulsars. The reported observation types are: TW, to- 
tal rms residuals and whitened residuals; TN, timing noise, 
S, higher order spindown terms (e.g., i>), STW, spindown 
terms, total rms (after measurement of spin-down terms), 
and whitened residuals; H, harmonically related sinusoids 
and whitened residuals; T, only upper limits on levels of 
timing noise are reported; T*, only the total rms is re- 
ported but timing noise is dominant contribution to rms 
residuals. 
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Table 6 

Timing Noise in PSR B1937+21 



T 

(y) 


CT RMS 

(/is) 


Ntqa 


Rcf. 


1.0 


0.15 




1 


1.0 


0.23 


22 


2 


1.0 


0.24 


19 


2 


1.0 


0.32 


16 


2 


1.0 


0.24 


16 


2 


1.0 


0.21 


18 


2 


1.0 


0.21 


11 


2 


1.0 


0.19 


23 


2 


1.2 


0.21 


13 


2 


1.5 


0.17 




3 


2.0 


0.25 


47 


2 


2.0 


0.29 


38 


2 


2.0 


0.20 


38 


2 


2.2 


0.20 


38 


2 


2.3 


0.19 




1 


2.4 


0.20 


231 


5 


2.7 


0.32 


85 


(5 


4.0 


0.20 


39 


7 


4.0 


0.30 




8 


1.0 


0.41 


85 


2 


1.2 


0.49 


80 


2 


1.1 


0.27 


168 


9 


8.2 


0.94 


440 


2 


10.0 


1.3 




10 


16.8 


9.3 


387 


10 


20.0 


112.0 


400 


11 


23.3 


24.2 


588 


12 


24.0 


27.4 
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Note. — Root mean square times of arrival for PSR 
B1937+21 for different observing programs. Column T 
shows the observing span, column o-rms shows the to- 
tal rms residuals, column ./Vtoa shows the number of 
times of arrival included in the analysis, and column Rcf. 
shows the numbe red r eferences. The ref eren ces are: (1) 
Manch ester (2008); (2) Kaspi ct al. (1994); (3) Man chester] 
(2009); (4) You ct al. (2007); (5) Hotan ct al. (2006); (6) 
Dcmorcst (2007); (7) Dcmorcst (2008); (8) Thcrcau (2008); 
(9) Vcrbicst (2009); (10) Lommcn (2002); (11) Hobbs ct al. 
(2004); (12) IVerbiest et al.l (120091) ; (13)[Janssen (200|)^ 



